WD-insert-951201

Inserting multimedia objects into HTML3

W3C Working Draft 01-Dec-95

This version: http://www.w3.org/pub/WWW/TR/WD-insert-951201
Latest version: http://www.w3.org/pub/WWW/TR/WD-insert
Editor: Dave Raggett <dsr@w3.org>
Authors: Charlie Kindel, Microsoft Corporation
Lou Montulli, Netscape Communications Corp.
Eric Sink, Spyglass Inc.
Wayne Gramlich, Sun Microsystems
Jonathan Hirschman, Pathfinder
Tim Berners-Lee, W3C
Dan Connolly, W3C

Status of this document

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at: http://www.w3.org/pub/WWW/TR

Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather than the URLs for working drafts themselves.

Abstract

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. This specification extends HTML to support the insertion of multimedia objects including Microsoft Component Object Model (COM) objects (e.g. OLE Controls and OLE Document embeddings), Java applets, and a wide range of other media plugins. The approach allows objects to be specified in a general manner and provides the ability to override the default implementation of objects.

Introduction
The INSERT tag
Decision Tree for binding objects
The PARAM tag
The ALIAS tag
Further work
The DTD

Introduction

HTML 2.0 defined only a single mechanism for inserting media into HTML documents: the IMG tag. While this tag has certainly proved worthwhile, the fact that it is restricted to image media severely limits its usefulness as richer and richer media find their way onto the Web.

Developers have been experimenting with ideas for dealing with new media: Microsoft's DYNSRC attribute for video and audio, Netscape's EMBED tag for compound document embedding, and Sun's APP and APPLET tags for executable code.

Each of these proposed solutions attacks the problem from a slightly different perspective, and on the surface they are very different. In addition, each falls short, in one way or another, of meeting the requirements of the Web community as a whole. However, we believe that the problem can be solved with a single extension that meets all of the current needs and is fully extensible for the future.

This specification defines a new tag <INSERT> which subsumes the role of the IMG tag, and provides a general solution for dealing with new media, while providing for effective backwards compatibility with existing browsers. INSERT allows the HTML author to specify the data, including persistent data and/or properties/parameters for initializing objects to be inserted into HTML documents, as well as the code that can be used to display/manipulate that data. Here, the term object is used to describe the things that people want to place in HTML documents, but other terms for these things are: components, applets, plugins, media handlers, etc.

The data can be specified in one of several ways: via a universally unique object identifier (uuid) (<<REFERENCE OSF/DCE RPC Specification>>), a file specified by a URL, in-line data, or as a set of named properties. In addition, there are a number of attributes that allow authors to specify standard properties such as width and height. The code for the object can be specified in several ways: indirectly by the object's uuid, by information included as part of the object's data, or by the combination of an object class name and a network address.

This specification covers the syntax and semantics for inserting such objects into HTML documents, but leaves out the architectural and application programming interface issues for how objects communicate with the document and other objects on the same page. It is anticipated that future specifications will cover these topics, including scripting languages and interfaces.

An introduction to the INSERT tag

This section is intended to help readers get the feel of the insertion mechanism, and is not a normative part of the specification. The INSERT tag provides a richer alternative to the IMG tag. It may be used when the author wishes to provide an alternative for user agents that don't support a particular medium. A simple example of using INSERT is:

    <insert data=TheEarth.avi type="application/avi">
    <param name=loop value=infinite>
    <img src=TheEarth.gif alt="The Earth">
    </insert>

Here the user agent would show an animation if it supports the AVI format, otherwise it would show a GIF image. The IMG element is used for the latter as it provides for backwards compatibility with existing browsers. The TYPE attribute allows the user agent to quickly detect that it doesn't support a particular format, and hence avoid wasting time downloading the object. Another motivation for using the TYPE attribute is when the object is loaded off a CD-ROM, as it allows the format to be specified directly rather than being inferred from the file extension.

The next example inserts an OLE control for a clock:

    <insert
       id=clock1
       type="application/x-oleobject"
       data="http://www.foo.bar/test.stm"
       code="http://www.foo.bar/controls"
    >
    </insert>

The ID attribute allows other controls on the same page to locate the clock. The DATA attribute points to the persistent stream data used to initialize the object's state. It includes a class identifier. The CODE attribute points to a file containing the implementation for this object. The file may contain the code for several classes, but this can be resolved by the class id from the object's data stream.

In the absence of the CODE attribute, the class identifier may be sufficient to locate the code implementing this object. User agents may provide a range of mechanisms for locating and downloading such code. For some formats such as image files, the Internet media type returned with the data is sufficient.

The class identifier can be specified explicitly using the CLASSID attribute. This value takes precedence over a class identifier included as part of the object's data, e.g.

    <insert
       id=clock1
       type="application/x-oleobject"
       classid="uuid:{663C8FEF-1EF9-11CF-A3DB-080036F12502}"
       data="http://www.acme.com/ole/clock.stm"
    >
    </insert>

Are the curly brackets really needed for GUIDs?

Yes. The standard string representation for a uuid includes the brackets. -cek
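As an illustration only, the CLASSID value used in the example above could be split into its name space prefix and identifier as follows. This is a minimal Python sketch and not part of the specification: the function name parse_classid is hypothetical, and treating identifiers under unknown prefixes as opaque strings is an assumption.

```python
import uuid


def parse_classid(value):
    """Split a CLASSID value such as 'uuid:{...}' into (prefix, identifier).

    Hypothetical helper: the draft only says that a URL-style prefix is
    separated from the identifier by a colon.
    """
    prefix, sep, rest = value.partition(":")
    if not sep or not prefix or not rest:
        raise ValueError("CLASSID must look like '<prefix>:<identifier>'")
    prefix = prefix.lower()
    if prefix == "uuid":
        # uuid.UUID accepts the bracketed form and ignores letter case,
        # so two spellings of the same uuid compare equal.
        return prefix, uuid.UUID(rest)
    # Assumption: other name spaces are kept as opaque strings.
    return prefix, rest


prefix, ident = parse_classid("uuid:{663C8FEF-1EF9-11CF-A3DB-080036F12502}")
print(prefix, ident)
```

Normalizing the identifier in this way is one option for letting a user agent compare two class identifiers regardless of how they were written.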

For speedy loading of objects you can inline the object's state data using the new URL scheme "data:", e.g.

    <insert
       id=clock1
       classid="uuid:{663C8FEF-1EF9-11CF-A3DB-080036F12502}"
       data="data:34hqi6n3gs9c3hdish2h568fhsb3uds7b4jawkl5h"
       type="application/x-oleobject; clsid=no"
    >
    </insert>

The data is expressed as a Base64 encoded byte stream. The interpretation of this stream is class dependent. If the CLASSID attribute is missing or is insufficient to disambiguate the precise format of this stream then the TYPE attribute may be used to resolve matters. In the example, the Internet media type for COM streams takes a parameter that indicates that the stream doesn't start with a class identifier.
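The inline form can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a normative definition: it follows only the description given in this draft (everything after "data:" is Base64), and the function names are hypothetical.

```python
import base64


def encode_data_url(raw):
    """Wrap raw object state bytes in this draft's 'data:' URL form."""
    return "data:" + base64.b64encode(raw).decode("ascii")


def decode_data_url(url):
    """Return the opaque byte stream carried by a 'data:' URL.

    Assumption: the scheme name is matched case-insensitively and the
    remainder must be strictly valid Base64.
    """
    if url[:5].lower() != "data:":
        raise ValueError("not a data: URL")
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(url[5:], validate=True)


state = b"\x00\x01clock-state"
url = encode_data_url(state)
print(url)
print(decode_data_url(url) == state)
```

How the decoded bytes are interpreted is still class dependent, as described above; the sketch stops at recovering the byte stream.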

The next example is a Java applet:

    <insert
       code="http://java.acme.com/applets/NervousText.class"
       width=400
       height=75
       align=center
    >
    <param name=text value="This is the Applet Viewer">
    </insert>

The CODE attribute points to the file containing the implementation of the NervousText class. There is only one class per file, so this is unambiguous. The other attributes on the INSERT element define rendering properties of the container for the applet viewer. The PARAM element specifies a named property which is used to initialize the class. PARAM elements can be combined with data streams for greater control.

A walk through the DTD

The document type definition provides the formal definition of the allowed syntax for HTML inserts. The following is an annotated listing of the DTD. The complete listing appears at the end of this document.

Standard Units for Lengths

Several attributes specify lengths as a number followed by an optional suffix. The units for lengths are specified by the suffix: pt denotes points, pi denotes picas, in denotes inches, cm denotes centimeters, mm denotes millimeters, em denotes em units (equal to the height of the default font), and px denotes screen pixels. The % sign indicates that the value is a percentage of the current displayable region: for widths, this is the space between the current left and right margins, while for heights, it is the height of the current window, table cell, etc. The default units are screen pixels. The number is an integer value or a real valued number such as "2.5". Exponents, as in "1.2e2", are not allowed. White space is not allowed between the number and the suffix.
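The rules for length values can be sketched as a small Python parser. This is a non-normative illustration: the names Length and parse_length are hypothetical, and the sketch assumes unsigned numbers, a digit on both sides of any decimal point, and case-insensitive suffixes.

```python
import re
from typing import NamedTuple


class Length(NamedTuple):
    value: float
    unit: str  # one of pt, pi, in, cm, mm, em, px, %


# A number (integer or real, no exponent) followed directly by an
# optional unit suffix; no white space is permitted in between.
_LENGTH_RE = re.compile(r"(\d+(?:\.\d+)?)(pt|pi|in|cm|mm|em|px|%)?",
                        re.IGNORECASE)


def parse_length(text):
    """Parse a standard length value; the default unit is screen pixels."""
    match = _LENGTH_RE.fullmatch(text)
    if match is None:
        raise ValueError("invalid length: %r" % text)
    number, suffix = match.groups()
    return Length(float(number), (suffix or "px").lower())


print(parse_length("400"))
print(parse_length("2.5in"))
print(parse_length("50%"))
```

Values such as "1.2e2" or "2.5 in" raise ValueError in this sketch, matching the restrictions on exponents and white space.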

The INSERT Tag

The INSERT element requires both start and end tags. The INSERT element has the same content model as the HTML BODY element, except that one or more optional PARAM or ALIAS elements can be placed immediately after the INSERT start tag and used to initialize the inserted object. The content of the INSERT element is rendered if the object specified by the data, code or classid attributes can't be rendered. This provides for backwards compatibility with existing browsers, and allows authors to specify alternative media via nested INSERT elements.

Note that this doesn't provide the same level of flexibility as would be provided by a richer description of resource variants, for instance when a resource is available in several media types, and for each such type in English, Spanish, French and German.
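The fallback behaviour of nested INSERT elements can be pictured with a short Python sketch. It is a simplification made for illustration: each element is modelled with a single fallback (either another insert or plain text), the class and function names are hypothetical, and support is decided from the TYPE attribute alone.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass
class Insert:
    data: str                       # value of the DATA attribute
    type: str                       # value of the TYPE attribute
    fallback: Union["Insert", str]  # content used if this object can't be rendered


def choose(node: Union[Insert, str], supported: Set[str]) -> str:
    """Walk outwards-in and return the first alternative that can be shown."""
    while isinstance(node, Insert):
        if node.type.lower() in supported:
            return node.data
        node = node.fallback  # try the element's content instead
    return node  # plain text is always renderable


earth = Insert("TheEarth.avi", "application/avi",
               Insert("TheEarth.gif", "image/gif", "The Earth"))
print(choose(earth, {"image/gif"}))
```

A user agent that supports neither format ends up with the innermost text, which mirrors how existing browsers would treat the content of an unknown element.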

<!-- Content model entities imported from parent DTD:

  %body.content allows inserts to contain headers, paras,
  lists, form elements and even arbitrarily nested inserts.
-->

<!ENTITY % attrs
       "id      ID       #IMPLIED  -- element identifier --
        class   NAMES    #IMPLIED  -- for subclassing elements --
        style   CDATA    #IMPLIED  -- rendering annotation --
        dir   (ltr|rtl)  #IMPLIED  -- I18N text direction --
        lang    NAME     #IMPLIED  -- as per RFC 1766 --">
        
<!ENTITY % URL "CDATA" -- universal reference locator -->
<!ENTITY % Align "(top|middle|bottom|left|center|right)">
<!ENTITY % Length "CDATA" -- standard length value -->

<!-- INSERT is a character-like element for inserting objects -->
<!ELEMENT insert - - ((param|alias)*, bodytext)>
<!ATTLIST insert
        %attrs      -- id, class, style, lang, dir --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        type    CDATA    #IMPLIED   -- Internet media type --
        align   %Align   #IMPLIED   -- positioning inside document --
        height  %Length  #IMPLIED   -- suggested height --
        width   %Length  #IMPLIED   -- suggested width --
        border  %Length  #IMPLIED   -- suggested link border width --
        hspace  %Length  #IMPLIED   -- suggested horizontal gutter --
        vspace  %Length  #IMPLIED   -- suggested vertical gutter --
        usemap  %URL     #IMPLIED   -- ref to image map --
        ismap   (ismap)  #IMPLIED   -- use server image map --
        >

<!-- the BODYTEXT element is needed to avoid problems with
      SGML mixed content, but is never used in actual documents -->
<!ELEMENT bodytext O O %body.content>

In general, all attribute names and values in this specification are case insensitive, except where noted otherwise. INSERT has the following attributes:

ID: Used to define a document-wide identifier. This can be used for naming positions within documents as the destination of a hypertext link. It may also be used by style sheets for rendering an element in a unique style, and by the user agent or other objects in the document to find and communicate with the object. An ID attribute value is an SGML NAME token. NAME tokens are formed by an initial letter followed by letters, digits, "-" and "." characters. The letters are restricted to A-Z and a-z.
CLASS: A space separated list of SGML NAME tokens. CLASS names specify that the element belongs to the corresponding named classes. These may be used by style sheets to provide class dependent renderings.
LANG: A LANG attribute identifies the natural language used by the content of the associated element. The syntax and registry of language values are defined by RFC 1766. In summary, the language is given as a primary tag followed by zero or more subtags, separated by "-". White space is not allowed and all tags are case insensitive. The name space of tags is administered by IANA. The two letter primary tag is an ISO 639 language abbreviation, while the initial subtag is a two letter ISO 3166 country code. Example values for LANG include:

      en, en-US, en-uk, i-cherokee, x-pig-latin.

DIR: Human writing systems are grouped into scripts, which determine amongst other things, the direction the characters are written. Elements of the Latin script are nominally left to right, while those of the Arabic script are nominally right to left. These characters have what is called strong directionality. Other characters can be directionally neutral (spaces) or weak (punctuation).
The DIR attribute specifies an encapsulation boundary which governs the interpretation of neutral and weakly directional characters. It does not override the directionality of strongly directional characters. The DIR attribute value is one of LTR for left to right, or RTL for right to left, e.g. DIR=RTL.
STYLE: The STYLE attribute allows you to include rendering information in a notation specified with the STYLE element in the document head. The default notation is W3C's CSS. W3C has produced a separate specification on how to associate HTML documents with rendering information in different notations, see W3C-style.
DATA: Specifies a URL referencing the object's data. This could be a GIF file or the pickled data representing an object's state. In many cases the media type or the data itself contains sufficient information to identify what code is needed to initialize the object. Note that an object's data can even be included inline for super efficient loading. This specification proposes a new URL scheme "data:". The rest of the URL is a base64 encoded character string that specifies the object's data as an opaque byte stream.
On its own, this would be meaningless. If the DATA attribute appears without a CODE or CLASSID attribute, then a TYPE attribute may be sufficient to interpret the data. For instance a Microsoft COM object can be asked to write its state using the WriteClassStream procedure. This inserts the object's class id as the first 16 bytes of the stream. If the TYPE attribute indicates that the data is in the COM persistent stream format, then the class id can be retrieved from the DATA attribute and used to find the code implementing the object's behaviour.
CODE: This specifies a URL referencing the code which implements the object's behaviour. In many cases, files will contain the code for several object classes. In this situation a URL fragment identifier can be used to name the code for the specific class within the file.
CLASSID: This can be used to specify a universally unique object identifier (uuid). This allows effective use of caching, as the user agent can use simple string comparison to check whether two objects are the same independent of their location. The CLASSID attribute value takes the form of a URL style prefix separated by a colon from the character string defining the uuid. The prefix is used to identify the global name space for the uuid, for example classid="uuid:{663C8FEF-1EF9-11CF-A3DB-080036F12502}" gives the uuid for a Microsoft COM object, using the UUID name space.
The CLASSID may be sufficient for the user agent to locate the code implementing the object. However, the CODE attribute can also be used with CLASSID to provide a hint as to where to look for this code. Note that the value specified with CLASSID takes precedence over values derived from the object's data stream. Is this really appropriate? (Yes -cek)
The CLASSID attribute can be used to override the default implementation as implied by the DATA attribute. For example, you may have the pickled data for an Excel spread sheet but want to view it with the "SuperGraph" package. You would then use the DATA attribute to point to the Excel spreadsheet data, and the CLASSID attribute to point to the SuperGraph plugin.
TYPE: This specifies an Internet Media Type (see RFC 1590) for the object's data. The attribute can be used to allow user agents to quickly skip media they don't support, and instead to render the contents of the INSERT element. It is also useful when loading objects off local drives as it allows the media type to be specified explicitly rather than being derived from the file extension.
The following grammar for media types is a superset of that for MIME because it does not restrict itself to the official IANA and x-token types.

       media-type     = type "/" subtype *( ";" parameter )
       type           = token
       subtype        = token

where token is defined by:

       token          =  1*<any (ASCII) CHAR except SPACE, CTLs, or tspecials>

       tspecials      = <one of the set>   ( ) < > @ , ; : \ " / [ ] ? =

Parameters may follow the type/subtype in the form of attribute/value pairs.

       parameter      = attribute "=" value
       attribute      = token
       value          = token | quoted-string

The type, subtype, and parameter attribute names are case-insensitive. Parameter values may or may not be case-sensitive, depending on the semantics of the parameter name. White space characters must not be included between the type and subtype, nor between an attribute and its value. If a given media-type value has been registered by the IANA, any use of that value must be indicative of the registered data format. Although HTML allows the use of non-registered media types, such usage must not conflict with the IANA registry. Data providers are strongly encouraged to register their media types with IANA via the procedures outlined in RFC 1590. All media types registered by IANA must be preferred over extension tokens. However, HTML does not limit applications to the use of officially registered media types, nor does it encourage the use of an "x-" prefix for unofficial types outside of explicitly short experimental use between consenting applications.
ALIGN: This determines where to place the object. The ALIGN attribute allows objects to be placed as part of the current text line, or as a distinct unit, aligned to the left, center or right.
For ALIGN=TOP, the top of the object is vertically aligned with the top of the tallest text for the current line.
For ALIGN=MIDDLE, the middle of the object is vertically aligned with the position midway between the baseline and the x-height for the text line in which the object appears. The x-height is defined as the top of a lower case x in western writing systems. If the text font is an all caps style then use the height of a capital X. For other writing systems, align the middle of the object with the middle of the text.
For ALIGN=BOTTOM, the bottom of the object is vertically aligned with the baseline of the text line in which the object appears.
For ALIGN=CENTER, the object is moved down to the next line and centered between the left and right margins. Subsequent text starts at the beginning of the next line.
For ALIGN=LEFT, the object is moved down and over to the current left margin. Subsequent text is flowed past the right hand side of the visible area of the object.
For ALIGN=RIGHT, the object is moved down and over to the current right margin. Subsequent text is flowed past the left hand side of the visible area of the object.
WIDTH: This gives the suggested width of a box enclosing the visible area of the object. The width is specified in standard units.
HEIGHT: This gives the suggested height of a box enclosing the visible area of the object. The height is specified in standard units.
BORDER: This attribute applies to the border shown when the object forms part of a hypertext link, as specified by an enclosing anchor element. The attribute specifies the suggested width of this border around the visible area of the object. The width is specified in standard units. For BORDER=0 no border should be shown. This is normally used when such a border would interfere with the visual affordances presented by the object itself. For instance, the object could render itself as a number of bevelled buttons.
HSPACE: The suggested width of the space to the left and right of the box enclosing the visible area of the object. The width is specified in standard units. This attribute is used to alter the separation of preceding and following text from the object.
VSPACE: The suggested height of the space to the top and bottom of the box enclosing the visible area of the object. The height is specified in standard units.
USEMAP: This specifies a universal resource identifier for a client-side image map in the format proposed by Spyglass Inc. This is normally appropriate only for static images.
ISMAP: When the INSERT element appears within a hypertext link, this attribute indicates that the server provides an image map, so that mouse clicks should be sent to the server in the same manner as for the IMG element. This is normally appropriate only for static images.
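The media type grammar given under the TYPE attribute above can be sketched as a small Python parser. This is an illustrative reading of the grammar rather than a reference implementation: the function name is hypothetical, and allowing white space around the ";" separator (as in "application/x-oleobject; clsid=no") is an assumption drawn from the earlier example.

```python
import re

# token: one or more ASCII characters other than SPACE, CTLs and tspecials.
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+"
_QUOTED = r'"(?:[^"\\]|\\.)*"'
_TYPE_RE = re.compile(rf"({_TOKEN})/({_TOKEN})")
_PARAM_RE = re.compile(rf"\s*;\s*({_TOKEN})=({_TOKEN}|{_QUOTED})")


def parse_media_type(text):
    """Return (type, subtype, params) for a TYPE attribute value."""
    text = text.strip()
    match = _TYPE_RE.match(text)
    if match is None:
        raise ValueError("invalid media type: %r" % text)
    # Type, subtype and parameter names are case-insensitive.
    type_, subtype = match.group(1).lower(), match.group(2).lower()
    params = {}
    pos = match.end()
    while pos < len(text):
        pm = _PARAM_RE.match(text, pos)
        if pm is None:
            raise ValueError("invalid parameter in: %r" % text)
        value = pm.group(2)
        if value.startswith('"'):
            # Strip the quotes and undo backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        params[pm.group(1).lower()] = value  # values keep their case
        pos = pm.end()
    return type_, subtype, params


print(parse_media_type("application/x-oleobject; clsid=no"))
```

A user agent could compare the returned type and subtype against the formats it supports before deciding whether to download the object or render the contents of the INSERT element instead.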

Decision Tree for Binding Objects

This section defines the steps needed to bind an object given various combinations of attributes. Without precise semantics for this, different user agents would produce different results, to the frustration of authors.

Note: is this decision tree overspecified? Is it correct? Is there a simpler explanation in terms of precedence rules? Please let the editor know if you have a proposal for this!

In the following, the ExtractClassId function is somewhat simplified. This function depends on recognising the format of the data stream. If the data is inlined, the format may be given by the TYPE attribute or derived from the CLASSID or CODE attributes. The format of the data stream also affects the Invoke function. For instance, the class initialization procedure may require you to adjust the stream pointer to after the class identifier.

This decision tree doesn't make explicit how to deal with the cases where you use the CODE attribute to load the implementation, but need additional information to find the appropriate entry point into this code. This could be provided by the CLASSID attribute or by a class identifier extracted from the data as pointed to by the DATA attribute.

    Properties(params);      // collect properties from PARAM elements
    Type(type);              // get value of TYPE attribute

    if CLASSID(clsid) then
        if Implementation(clsid, class) then
            if (DATA(data)) then
                if (Load(data, stream)) then
                    Invoke(class, stream, params);
                else
                    Fail("Can't load data");
                fi
            else // no data
                Invoke(class, null, params);
            fi
        else // no implementation for CLASSID
            if CODE(code) then
                if Load(code, class) then
                    if DATA(data) then
                        if Load(data, stream) then
                            Invoke(class, stream, params);
                        else
                            Fail("Can't load data");
                        fi
                    else
                        Invoke(class, null, params);
                    fi
                else   // can't load code
                    if DATA(data) then
                        if Load(data, stream) then
                            if ExtractClassId(stream, clsid) then
                                if Implementation(clsid, class) then
                                    Invoke(class, stream, params);
                                else
                                    Fail("Can't get implementation");
                                fi
                            else
                                Fail("Can't get implementation");
                            fi
                        else
                            Fail("Can't get implementation");
                        fi
                    else  // no data
                        Fail("Can't get implementation");
                    fi
                fi
            else // no CODE attribute
                if DATA(data) then
                    if Load(data, stream) then
                        if ExtractClassId(stream, clsid) then
                            if Implementation(clsid, class) then
                                Invoke(class, stream, params);
                            else
                                Fail("Can't get implementation");
                            fi
                        else
                            Fail("Can't get implementation");
                        fi
                    else
                        Fail("Can't load data");
                    fi
                else
                    Fail("Can't get implementation");
                fi
            fi
        fi
    else // no CLASSID
        if CODE(code) then
            if Load(code, class) then
                if DATA(data) then
                    if Load(data, stream) then
                        Invoke(class, stream, params);
                    else
                        Fail("Can't load data");
                    fi
                else // no DATA
                    Invoke(class, null, params);
                fi
            else
                Fail("Can't get implementation");
            fi
        else // no CODE
            if DATA(data) then
                if Load(data, stream) then
                    if ExtractClassId(stream, clsid) then
                        if Implementation(clsid, class) then
                            Invoke(class, stream, params);
                        else
                            Fail("Can't get implementation");
                        fi
                    else
                        Fail("Can't get implementation");
                    fi
                else
                    Fail("Can't get implementation");
                fi
            else
                Fail("Can't get implementation");
            fi
        fi
    fi
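The decision tree above can be restated more compactly. The following Python sketch is one possible reading, not a replacement for the pseudocode: the Environment callables stand in for Implementation, Load and ExtractClassId and are hypothetical, bind returns the arguments that Invoke would receive, and every failure on the path that derives the class from the data stream is reported with a single message.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class BindError(Exception):
    """Raised where the pseudocode calls Fail(...)."""


@dataclass
class Environment:
    implementation: Callable[[str], Optional[object]]    # class id -> class
    load_code: Callable[[str], Optional[object]]         # CODE URL -> class
    load_data: Callable[[str], Optional[bytes]]          # DATA URL -> stream
    extract_class_id: Callable[[bytes], Optional[str]]   # stream -> class id


def bind(env, classid=None, code=None, data=None, params=None):
    """Return (class, stream, params) as passed to Invoke, or raise BindError."""
    params = params or {}

    def with_optional_data(cls):
        # The class is known; DATA is optional but must load if present.
        if data is None:
            return cls, None, params
        stream = env.load_data(data)
        if stream is None:
            raise BindError("Can't load data")
        return cls, stream, params

    def from_data_stream():
        # Last resort: take the class id from the data stream itself.
        if data is not None:
            stream = env.load_data(data)
            if stream is not None:
                clsid = env.extract_class_id(stream)
                cls = env.implementation(clsid) if clsid is not None else None
                if cls is not None:
                    return cls, stream, params
        raise BindError("Can't get implementation")

    if classid is not None:
        cls = env.implementation(classid)
        if cls is None and code is not None:
            cls = env.load_code(code)
        return with_optional_data(cls) if cls is not None else from_data_stream()
    if code is not None:
        cls = env.load_code(code)
        if cls is None:
            # Without CLASSID, a failed CODE load is final in the tree above.
            raise BindError("Can't get implementation")
        return with_optional_data(cls)
    return from_data_stream()
```

Written this way, the asymmetry in the tree is easier to see: when CLASSID is present, a failed CODE load falls back to the class identifier in the data stream, whereas without CLASSID it fails immediately.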

The PARAM element

The PARAM element allows a list of named property values (used to initialize an OLE control, plug-in module or Java applet) to be represented as a sequence of PARAM elements. Note that PARAM is an empty element and should appear without an end tag.

<!ELEMENT param - O EMPTY -- named property value -->
<!ATTLIST param
        name    CDATA    #REQUIRED  -- property name --
        value   CDATA    #IMPLIED   -- property value --
        valueref  %URL   #IMPLIED   -- ref to object ALIAS --
        >

The NAME attribute defines the property name. The case sensitivity of the name is dependent on the code implementing the object.

The value attribute is used to specify the property value. It is an opaque character string whose meaning is determined by the object based on the property name. Note that CDATA attribute values need characters such as & to be escaped using the standard SGML character entities. It is also essential to escape the > character to defend against incorrect handling by many existing browsers.
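The escaping requirement can be illustrated with a short Python sketch that writes out a PARAM element. The helper name param_tag is hypothetical, and the sketch escapes quote characters and < in addition to the & and > mentioned above, which is a conservative assumption rather than a requirement of this draft.

```python
from html import escape


def param_tag(name, value):
    """Serialize a PARAM element with its attribute values escaped."""
    # escape() replaces &, < and >; quote=True also replaces " and '.
    return '<param name="{}" value="{}">'.format(
        escape(name, quote=True), escape(value, quote=True))


print(param_tag("expr", "a>b & c"))
```

The object still receives the original string, since the user agent resolves the entities before passing the value on.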

The valueref attribute is used when the property is an object. A distinct attribute is needed as in some cases the property type cannot be deduced from the property name. The valueref attribute provides a URL based reference to an ALIAS element that defines the object itself.

Note: do we want a shortcut in which valueref specifies the object directly rather than via an alias? For example valueref=foo.gif, or when you want to use inline data with the "data:" URL scheme? This would save having to include an associated ALIAS element, but might require adding a TYPE attribute to the PARAM element.

(I like this idea! "Valueref=data:abcdef00...b34f#name=arial size=10 bold=true". Where the data is just the CLSID. -cek)

The ALIAS element

The ALIAS element is used to define an object without inserting it into the document. It is used with the valueref attribute of the PARAM element to allow an object to be passed as parameter, when initializing an object associated with another INSERT or ALIAS element. The attributes take exactly the same meaning as for the INSERT element. The ALIAS element is a container and requires both start and end tags. The contents are limited to PARAM and ALIAS elements, although it is anticipated that this may be extended to cover the same content model as INSERT at some point in the future.

<!-- ALIAS is allowed in document HEAD and BODY
   it defines an alias for an object without inserting it -->
<!ELEMENT alias - - (param*, alias*)>
<!ATTLIST alias
        id      ID       #REQUIRED  -- defines name for alias --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        >

Note that the object isn't created until it's needed by something that points to it. Each such reference creates a separate copy of the object.
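The creation rule for aliased objects can be pictured with a small Python sketch. It is illustrative only: the AliasTable class is hypothetical, objects are represented by arbitrary factory functions, and references are assumed to take the fragment form "#id".

```python
from typing import Callable, Dict


class AliasTable:
    """Maps ALIAS ids to factories so that objects are created on demand."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}

    def define(self, alias_id: str, factory: Callable[[], object]) -> None:
        # Defining an alias records how to build the object; nothing is built yet.
        if alias_id in self._factories:
            raise ValueError("duplicate alias id: %r" % alias_id)
        self._factories[alias_id] = factory

    def resolve(self, valueref: str) -> object:
        # Assumption: valueref is a fragment reference such as "#font1".
        if not valueref.startswith("#"):
            raise ValueError("expected a '#id' reference: %r" % valueref)
        alias_id = valueref[1:]
        if alias_id not in self._factories:
            raise KeyError(alias_id)
        # Each reference gets its own freshly created copy.
        return self._factories[alias_id]()


table = AliasTable()
table.define("font1", lambda: {"name": "arial", "size": 10})
first = table.resolve("#font1")
second = table.resolve("#font1")
print(first == second, first is second)
```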

Note: What should the user agent do if it can't support a particular aliased object? One answer would be for it to recover and try the contents of the ALIAS element. To allow this to work I have changed the content model for ALIAS from (param|alias)*.

(Good -cek)

Further Work

This section describes proposals for extending the capabilities of the insertion mechanism as an encouragement and guide to developers wishing to experiment with such features. These ideas are not part of the current specification, and support is not required for conformance with this specification.

Document Background

Using a GIF image to tile the document background often results in significant delays while the image tile is downloaded. The ability to use a small Java applet or OLE Control to generate the image tile would allow rich background textures and patterns to be used without causing significant delay. As processing speeds increase, the ability to use an object to generate the background would make practical animated backgrounds.

The proposed extension is to allow the BACKGROUND attribute of the BODY element to reference ALIAS elements, for example:

<title>Demo Document</title>
<alias id=marble code="http://www.acme.com/applets/marble.class">
</alias>
<body background="#marble">
<p>This document has a marble texture generated by a Java applet.

Overlays

Overlays are useful for reducing network bandwidth needs. For instance, you can place a PNG overlay on top of a JPEG image. If the PNG image is an antialiased text overlay while the JPEG image is a photographic image with a high compression factor, then the two images will take significantly less time to send than a single image combining both layers. Selecting the format and compression for each layer separately allows you to get higher compression for the same level of quality.

Overlays also save time by making caching more effective. For instance you might send a large image on one page, and then make small changes to it on subsequent pages. Using an overlay allows the original large image to be reused, so that only the small changes need to be sent with each successive page.

<!ELEMENT overlay - O EMPTY -- image overlay -->
<!ATTLIST overlay
        id          ID          #IMPLIED  -- for naming this overlay --
        class       NAMES       #IMPLIED  -- for subclassing element --
        style       CDATA       #IMPLIED  -- for attaching style info --
        x           %Length     #IMPLIED  -- offset from left of parent --
        y           %Length     #IMPLIED  -- offset from top of parent --
        width       %Length     #IMPLIED  -- suggested width --
        height      %Length     #IMPLIED  -- suggested height --
        src         %URL        #IMPLIED  -- network address of object --
        >

For instance, here is a road map overlaid on an aerial photograph:

    <insert data="photo.jpeg">
        <overlay src=grid.png>
    </insert>
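
The caching case described earlier can be sketched in the same way. The following assumes, hypothetically, that a later page reuses photo.jpeg from the cache and fetches only a small marker image. The file name marker.png and the offsets are illustrative, and the lengths are assumed to be in pixels:

    <insert data="photo.jpeg">
        <overlay src="marker.png" x=120 y=80>
    </insert>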

The SRC attribute of the OVERLAY element could be used together with an ALIAS element. This allows you to create overlays from OLE controls or Java applets.
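
As a sketch of how this might look, by analogy with the BACKGROUND proposal above, the SRC attribute could reference the ALIAS by its ID. The applet URL and the ID "grid" are hypothetical, and the fragment-style reference is an assumption rather than settled syntax:

    <alias id=grid code="http://www.acme.com/applets/grid.class">
    </alias>
    <insert data="photo.jpeg">
        <overlay src="#grid">
    </insert>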

Non-Rectangular and Resizable Objects

Many objects will size themselves according to their contents. Another popular feature is likely to be the ability for users to dynamically resize objects, e.g. by dragging size bars. The height and width attributes of the INSERT element can be used as suggested initial values. For instance, images can be automatically resized to match these values. The ability to smoothly magnify an image allows a small image file to fill a large space, and saves network time.
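
For example, the following sketch asks the user agent to magnify a small image to fill a larger area. The file name and dimensions are illustrative, and the lengths are assumed to be in pixels:

    <insert data="thumbnail.gif" width=400 height=300>
    </insert>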

This specification limits objects to a rectangular outline. Of course, the object can render itself with a transparent background to give the effect of a shaped object, but any text flowing past would always follow the rectangular frame around the object.

An obvious extension is to allow the user agent to ask the object for its outline, and to flow text around that outline. To speed up display, it will also be useful to be able to specify an explicit outline as an attribute of the INSERT element, in the same spirit as the width and height attributes. It is strongly suggested that this attribute is called "SHAPE" and that it has exactly the same syntax as the client-side image map proposal.
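
A sketch of what this might look like, assuming the attribute follows the SHAPE and COORDS pairing that the client-side image map proposal uses for its AREA element. The COORDS attribute on INSERT, the file name, and the values are assumptions for illustration only. Here text would flow around a circle centred at (100,100) with a radius of 90:

    <insert data="globe.gif" width=200 height=200
            shape=circle coords="100,100,90">
    </insert>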

Figure Captions

A time-proven idiom for document layout is the figure. This is often an illustration, but may contain textual material separate from the main flow of the document. Figures are typically captioned and floated to between columns or to the top or bottom of a page. It is not uncommon to see separate contents lists for figures and tables, in addition to the main table of contents.

It is proposed that the FIG element is used to create captioned figures:

    <!ENTITY % f.align "(left|center|right)">
        
    <!ELEMENT fig - - (caption?, bodytext)>
    <!ATTLIST fig
        %attrs                      -- id, class, style, lang, dir --
        align   %f.align #IMPLIED   -- position on page --
        height  %Length  #IMPLIED   -- suggested height --
        width   %Length  #IMPLIED   -- suggested width --
        >
        
    <!ENTITY % c.align "(top|bottom|left|right)">
        
    <!ELEMENT caption - - %body.content>
    <!ATTLIST caption
        %attrs                      -- id, class, style, lang, dir --
        align   %c.align #IMPLIED   -- position relative to figure --
        >

For example:

    <fig>
        <caption>Mount Washington</caption>
        <insert data="http://www.acme.com/images/vista.jpeg">
        <p>A spectacular view of Mount Washington during a winter sunset.
        </insert>
    </fig>
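
The alignment attributes declared above can be combined. The following sketch, with illustrative values, positions the figure on the right and places the caption beneath it. The width is assumed to be in pixels:

    <fig align=right width=300>
        <caption align=bottom>Mount Washington</caption>
        <insert data="http://www.acme.com/images/vista.jpeg">
        <p>A spectacular view of Mount Washington during a winter sunset.
        </insert>
    </fig>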

Note that FIG is a block-like element similar to tables. If a user agent supports tables then adding support for figures is quite simple, since the FIG element behaves in the same way as a table with a caption and a single cell. Of course, some would ask why not simply use the TABLE element in this case. Doing so is an example of approaching HTML with a view to getting a desired visual effect without regard to what the markup means. This makes it hard to export HTML to other document formats, and makes the markup harder to read, as you now have to guess what the author was actually trying to do.

Credits and Copyright

Image formats such as GIF and PNG allow textual messages to be included with the image data. This provides an appropriate place to store copyright messages, since these will then be automatically transferred with the image when it is dragged from the document to the desktop, etc. For other formats, copyright notices can be included in HTTP headers, or by wrapping up the objects as MIME multipart files.

Even when information can be included as part of the object's data, it is still useful to present a credit or copyright message to the user as part of the document text. Something immediately visible is more effective than something hidden! The suggested way of handling credits is to use a new character emphasis element CREDIT that can be given as part of the FIG contents, e.g.

    <fig>
        <insert data="http://www.acme.com/images/vista.jpeg">
        <p>A spectacular view of Mount Washington during a winter sunset.
        </insert>
        <credit>John Smith</credit>
    </fig>

HTML Inserts DTD

The DTD or document type definition provides the formal definition of the allowed syntax for HTML inserts.

<!-- Content model entities imported from parent DTD:

  %body.content allows inserts to contain headers, paras,
  lists, form elements and even arbitrarily nested inserts.
-->

<!ENTITY % attrs
       "id      ID       #IMPLIED  -- element identifier --
        class   NAMES    #IMPLIED  -- for subclassing elements --
        style   CDATA    #IMPLIED  -- rendering annotation --
        dir   (ltr|rtl)  #IMPLIED  -- I18N text direction --
        lang    NAME     #IMPLIED  -- as per RFC 1766 --">
        
<!ENTITY % URL "CDATA" -- uniform resource locator -->
<!ENTITY % Align "(top|middle|bottom|left|center|right)">
<!ENTITY % Length "CDATA" -- standard length value -->

<!-- INSERT is a character-like element for inserting objects -->
<!ELEMENT insert - - ((param|alias)*, bodytext)>
<!ATTLIST insert
        %attrs      -- id, class, style, lang, dir --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        type    CDATA    #IMPLIED   -- Internet media type --
        align   %Align   #IMPLIED   -- positioning inside document --
        height  %Length  #IMPLIED   -- suggested height --
        width   %Length  #IMPLIED   -- suggested width --
        border  %Length  #IMPLIED   -- suggested link border width --
        hspace  %Length  #IMPLIED   -- suggested horizontal gutter --
        vspace  %Length  #IMPLIED   -- suggested vertical gutter --
        usemap  %URL     #IMPLIED   -- ref to image map --
        ismap   (ismap)  #IMPLIED   -- use server image map --
        >

<!-- the BODYTEXT element is needed to avoid problems with
      SGML mixed content, but is never used in actual documents -->
<!ELEMENT bodytext O O %body.content>

<!ELEMENT param - O EMPTY -- named property value -->
<!ATTLIST param
        name    CDATA    #REQUIRED  -- property name --
        value   CDATA    #IMPLIED   -- property value --
        valueref  %URL   #IMPLIED   -- ref to object ALIAS --
        >

<!-- ALIAS is allowed in document HEAD and BODY
   it defines an alias for an object without inserting it -->
<!ELEMENT alias - - (param|alias)*>
<!ATTLIST alias
        id      ID       #REQUIRED  -- defines name for alias --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        >
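
As an illustration of how these declarations fit together, the following fragment is a sketch only. The URLs, the ID "chime", and the parameter names are hypothetical, and the "#chime" form for VALUEREF is an assumption modelled on the BACKGROUND example earlier in this document:

    <alias id=chime data="http://www.acme.com/sounds/chime.au">
    </alias>
    <insert code="http://www.acme.com/applets/clock.class"
            width=100 height=100 align=right>
        <param name=interval value=60>
        <param name=sound valueref="#chime">
        <p>A clock applet would be shown here.
    </insert>

The PARAM elements come first, as the content model requires, and the paragraph that follows them is ordinary body content held by the implied BODYTEXT element.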

References

Cascading Style Sheets: W3C's working draft specification can be found at: http://www.w3.org/pub/WWW/Style/css/draft.html
HTML and Style Sheets: W3C's working draft specification on associating rendering information with HTML documents can be found at: http://www.w3.org/pub/WWW/TR
Internet Media Types - RFC 1590: J. Postel. "Media Type Registration Procedure." RFC 1590, USC/ISI, March 1994. This can be found at ftp://ds.internic.net/rfc/rfc1590.txt.
MIME - RFC 1521: Borenstein N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993. This can be found at ftp://ds.internic.net/rfc/rfc1521.txt.
Client-Side Image maps: "A Proposed Extension to HTML : Client-Side Image Maps", James L. Seidman, 03 Aug 1995. draft-ietf-html-clientsideimagemap-01.txt
The markup language known as "HTML/2.0" provides for image maps. Image maps are document elements which allow clicking on different areas of an image to reference different network resources, as specified by Uniform Resource Identifiers (URIs). The image map capability in HTML/2.0 is limited in several ways, such as the restriction that it only works with documents served via the "HTTP" protocol, and the lack of a viable fallback for users of text-only browsers. This document specifies an extension to the HTML language, referred to as "Client-Side Image Maps," which resolves these limitations.

The World Wide Web Consortium: http://www.w3.org/pub/WWW/Consortium/